
This PR makes Gradient Descent parallelized using Threads.@spawn #179

Open
wants to merge 12 commits into
base: master
Conversation

@Gertian (Collaborator) commented Oct 15, 2024

As the title suggests, this PR makes gradient descent parallel...

Two remarks:

  1. I failed to parallelize:
```julia
function retract(x::ManifoldPoint{<:FiniteMPS}, g, alpha)
    # TODO: support parallelize_sites.
    state = x.state
    envs = x.envs

    y = copy(state)  # the end-point
    h = similar(g)   # the tangent at the end-point
    for i in 1:length(g)
        yal, th = Grassmann.retract(state.AL[i], g[i].Pg, alpha)
        h[i] = PrecGrad(th)
        y.AC[i] = (yal, state.CR[i])
    end
    normalize!(y)

    n_y = ManifoldPoint(y, envs)

    return n_y, h
end
```
  2. The tests currently don't consider n-site unit cells. To cure this, I'll add new tests with 3-site unit cells in a separate PR.
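For reference, the spawn-per-site pattern this PR applies elsewhere looks roughly like the self-contained sketch below. The helper `parallel_map_sites` and the toy workload are illustrative stand-ins, not MPSKit API; the `retract` loop above presumably resists this treatment because setting `y.AC[i]` mutates shared MPS state.

```julia
# Hypothetical sketch of a spawn-per-site loop using Threads.@spawn.
# Each task writes to a distinct slot of `results`, so there is no data race.
function parallel_map_sites(f, n::Int)
    results = Vector{Any}(undef, n)
    @sync for i in 1:n
        Threads.@spawn begin
            results[i] = f(i)  # independent per-site work
        end
    end
    return results
end

# usage: "process" each site index independently
squares = parallel_map_sites(i -> i^2, 4)
```

The `@sync` block ensures all spawned tasks finish before the results are returned.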

@Gertian Gertian requested a review from lkdvos October 15, 2024 14:14
@Gertian (Collaborator, Author) commented Oct 15, 2024

I made the extra tests in a separate PR.

I combined the two locally, and the new tests pass on my laptop with `julia -t 3 runtests.jl`.

codecov bot commented Oct 15, 2024

Codecov Report

Attention: Patch coverage is 80.85106% with 9 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/algorithms/grassmann.jl | 80.85% | 9 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/algorithms/grassmann.jl | 93.10% <80.85%> (-6.90%) ⬇️ |

... and 1 file with indirect coverage changes

@Gertian (Collaborator, Author) commented Oct 15, 2024

@lkdvos I don't understand how to make codecov happy.

Whether the new lines are exercised depends on the MPSKit.Defaults settings, doesn't it?

I understand that we might want to test both scenarios, but as far as I can tell that is currently not done in any of the tests.

P.S.: I've compared my codecov report with that of the existing parallel VUMPS implementation. The uncovered lines are of the same kind, but VUMPS has only 2 of them whereas my gradient descent code has many more...

@Gertian (Collaborator, Author) commented Oct 15, 2024

The failing macOS test seems unrelated to the code I've added.

@Gertian (Collaborator, Author) commented Oct 28, 2024

It seems like this PR works and does parallelize the code.

However, it reintroduces out-of-memory crashes that Julia 1.10 had solved. I've asked the cluster managers to update to 1.11 to see if that fixes the problem :)

Apart from that, the issue might be solvable with GC.safepoint(), but I'm not sure how that works.
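For what it's worth, `GC.safepoint()` is a Base function that marks a point where the calling thread may be stopped for garbage collection; inserting it periodically into a long, allocation-free loop can let garbage produced by other threads be collected promptly. A minimal, self-contained sketch (the function and workload are hypothetical, not from this PR):

```julia
# Hypothetical sketch: periodic GC.safepoint() calls in a tight loop give the
# garbage collector a chance to stop this thread while other threads allocate.
function busy_sum(n::Int)
    s = 0.0
    for i in 1:n
        s += sin(i)                         # allocation-free inner work
        i % 10_000 == 0 && GC.safepoint()   # periodic safepoint
    end
    return s
end

total = busy_sum(100_000)
```

Whether this actually resolves the out-of-memory behavior here would need to be tested on the cluster.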

2 participants